GCP DevOps Engineer

Analytical Services
Campbell, California


Description

We are looking for a Senior GCP DevOps Engineer with a deep architectural mindset to design and manage our global cloud footprint. The role goes beyond managing tools: you will build a resilient, high-availability platform that spans multiple regions and zones, keeping our services available and low-latency for a global user base.

Key Responsibilities

1. High-Availability Architecture

  • Multi-Region Strategy: Design and implement resilient architectures across multiple GCP regions and zones to meet a 99.99% uptime target and support robust disaster recovery.
  • Traffic Management: Deploy and manage global load balancers on Google Cloud Load Balancing (GCLB), together with Cloud DNS, to optimize traffic flow and minimize latency.
  • Database Reliability: Architect distributed database solutions (e.g., Cloud Spanner, Cloud SQL with cross-region replicas) to maintain data consistency and availability.

2. Core DevOps & Automation

  • CI/CD Leadership: Build and optimize deployment pipelines using Cloud Build, GitLab CI, or GitHub Actions, with a focus on canary and blue-green deployment patterns.
  • Infrastructure as Code (IaC): Standardize all infrastructure via Terraform, utilizing modular designs to ensure consistency across dev, staging, and production environments.
  • Configuration Management: Manage environment-specific configurations and secrets using Secret Manager and Config Controller.

3. Performance & Scalability

  • Fleet Management: Oversee large-scale Google Kubernetes Engine (GKE) clusters, implementing Multi Cluster Ingress and Anthos for cross-region workload orchestration.
  • Auto-scaling & Efficiency: Define custom scaling metrics so the platform scales out during peak load and scales back in during idle periods, keeping resource usage efficient.

Required Technical Profile

  • Architectural Depth: Extensive experience with GCP network design, including Shared VPC, Cloud Interconnect, and VPC Network Peering.
  • Containerization: Mastery of Docker and Kubernetes (GKE), specifically in multi-cluster or multi-region configurations.
  • Automation: Expert-level proficiency in Terraform (specifically building reusable modules).
  • Resilience Engineering: Proven track record of running chaos engineering experiments or disaster recovery (DR) drills to test system resilience.